A Survey on the Methods Used in Document Digitization and its Applications

Authors

  • Greeshmamol Varghese
  • Kumudha Raimond
Abstract

Document digitization and Document Analysis and Recognition (DAR) are techniques for handling document images. Several techniques have been implemented to perform document digitization, and article reconstruction is one of its main applications. The four major steps of article reconstruction are grouping the article bodies, detecting the reading order, associating title-body pairs, and linking article parts scattered across different pages. This paper presents a survey of the techniques used for document digitization and for article reconstruction.

Keywords: Document digitization, Document analysis and recognition, Article reconstruction, Data mining, Pattern recognition, Clustering

Similar resources

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

The perceptibility curve test applied to CCD and two methods of digitization of dental film-based radiographs

Objectives: Several methods of image acquisition are available in dentistry. There is no generally accepted method of image digitization that makes all types of images comparable. The objective of this study was to compare the diagnostic accuracy of different methods of image digitization. Methods: This diagnostic accuracy study used the perceptibility curve test, which first intr...

A Survey on Document Image Analysis and Retrieval System

The digitization of documents and their availability over the network demand solutions for content-based document image analysis, indexing, searching and retrieval. The signature, logo and layout of a document present convincing evidence and provide an important form of indexing for effective document image retrieval in a variety of applications. This paper describes methods and techniques de...

A Survey on Practice and Challenges of Balanced Score Card in Higher Education Institutions: A Case study on Selected Public Universities in Ethiopia

The purpose of this study is to assess the practice and challenges of BSC encountered by public higher education institutions as a strategic management tool in implementing their strategic plans. In this research, the researchers used both quantitative and qualitative research approaches in its successful accomplishment. The quantitative frames will be made...

A New Text Mining Method for Extracting User Context Information to Improve the Ranking of Search Engine Results

Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual and documentary material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo and Bing need to read a great many web documents and retrieve the most similar ones to the user ...

Publication date: 2014